-
- CHAPTER FOUR
-
- THE STANDARDS OF CD-ROM
-
-
- CD-ROM DATA EXCHANGE STANDARD (DXS) OVERVIEW
-
- Peter Ciuffetti
- SilverPlatter Information Inc.
-
-
- The Data Exchange Standard (DXS) document set (of which
- this document forms a part) defines a general purpose
- mechanism for standardizing information access for a
- wide variety of information sources and delivery
- platforms. The DXS document set identifies the
- architecture that allows information retrieval system
- developers to build systems which are user interface
- independent. The beneficiaries in an environment where
- information access is standardized are ultimately
- researchers themselves, but the information industry
- and society as a whole will also benefit as a result.
- This is the first live version of this document.
-
-
- THE NEED FOR A DATA EXCHANGE STANDARD
-
- The requirement for interface independence has been
- prompted by the success of the CD-ROM industry and has
- been voiced by the consumers of those products emerging
- from that industry. The success of the CD-ROM industry
- results from the physical standardization of the CD-ROM
- disc (by Philips and Sony) and international acceptance
- of the directory structure on the disc (ISO 9660).
- These standards have simplified the creation of
- distributable information products and comprehensive
- catalogs of data sources now exist in every major
- discipline.
- Many organizations large and small have built a
- collection of these products and have become
- overwhelmed by the variety of computer systems which
- each of their constituents must learn in order to
- extract the embedded knowledge on each disc. The
- diversity of these systems has created a barrier to the
- continued growth and acceptance of distributable
- information products. This variety, although emphasized
- by the success of the CD-ROM industry, encompasses all
- information products, whether distributable or not,
- since researchers desire access to each electronic
- resource at their disposal.
- All researchers, from casual to full-time, require that
- access to information be universally simplified,
- regardless of its source. Were universal simplification
- readily achievable, that would be the subject of this
- document set. Given that it is not, due to the
- diversity of researchers themselves, the architecture
- of DXS strives for a compromise which is readily
- achievable. DXS separates the window into the
- information from the information itself and leaves the
- selection of the window to the researcher in a way that
- achieves interoperability and interface independence.
-
-
- CLIENT-SERVER ARCHITECTURE
-
- The architecture used by DXS to achieve
- interoperability is called "client-server." Many
- computer systems successfully use a client-server
- strategy to simplify or standardize their
- functionality. To describe this approach, let's
- identify the three major elements in any simple
- information retrieval system. They are: the database,
- the software program used to query that database, and
- the computer that the software program runs on. In
- today's predominantly non-client-server products, the
- database and the software come from the same vendor and
- run only on a few types of computer. Due to lack of
- standards, it is typically not possible for a
- researcher to select one of these three elements and
- replace it with another that he prefers. In most
- commercial products, these elements are so tightly
- bound that the researcher has to accept them as a
- single package. To create some degree of freedom, the
- client-server architecture splits the software program
- in half. One half encompasses all of the functions
- necessary for query formulation and display. Typically
- this is called the "client" or "user interface." The
- other half encompasses all of the functions necessary
- for query evaluation and data access. Usually this is
- called the "server" or "retrieval engine." As a result
- of the software split, there are now a total of four
- basic elements: the database, the client program, the
- server program and the computer. The glue that holds
- the client and server together is a messaging system,
- analogous to electronic mail, that the client and
- server use to pass queries and results back and forth.
- It is this messaging system which is the target of DXS
- standardization. By defining the syntax and semantics
- of the messages passed between the client and the
- server, interoperability is achieved between clients
- and servers supplied by different vendors; this in turn
- allows the researcher to select the user interface that
- he prefers. Other approaches besides client-server
- could be used to achieve interoperability. These
- include standardizing the database file structures or
- standardizing user interfaces. These are perhaps more
- appropriate for niche communities which may be prepared
- to accept limitations in adopting future file structure
- and user interface technologies in favour of standards
- today. Client-server architecture serves as a more
- flexible solution which allows researchers to choose
- any available DXS-conformant user interface while
- preserving the developer's freedom to change
- technologies.
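-
- As a minimal sketch of the split described above, the
- following Python fragment passes a query message from a
- client to a server and a result message back through two
- queues. It does not reproduce the DXS message format; the
- message fields (type, terms, hits) and the sample records
- are assumptions made purely for illustration.
-
```python
import queue
import threading

# Hypothetical database owned by the server: record id -> text.
RECORDS = {1: "cd-rom standards overview", 2: "aerospace technical manuals"}


def server(requests, responses):
    """Retrieval engine half: evaluates queries and accesses the data."""
    while True:
        msg = requests.get()
        if msg["type"] == "quit":
            break
        hits = sorted(
            rid for rid, text in RECORDS.items()
            if all(term in text.split() for term in msg["terms"])
        )
        responses.put({"type": "result", "hits": hits})


def client_search(requests, responses, terms):
    """User interface half: formulates a query and returns the hits."""
    requests.put({"type": "query", "terms": terms})
    return responses.get()["hits"]


def run_demo():
    """Start the server, send one query, then shut the server down."""
    requests, responses = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=server, args=(requests, responses))
    worker.start()
    hits = client_search(requests, responses, ["standards"])
    requests.put({"type": "quit"})
    worker.join()
    return hits
```
-
- Because the client only sees messages, the server could be
- replaced by another vendor's program without changing the
- client, provided both agree on the message contents.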
-
-
- INTEROPERABILITY
-
- Under DXS, interoperability is defined as "the ability
- for any conforming DXS client to query any conforming
- DXS server with which it has the ability to
- communicate." More specifically, a researcher can
- select any DXS compliant client program, from any
- creator of such a program, and use it to identify some
- meaningful set of the information contained in any DXS
- compliant database and display that information in some
- form when those programs support the same operating
- system or network. In addition, the researcher is able
- to switch from searching one conforming database from
- one vendor to another conforming database from a
- different vendor without having to change, or even
- leave, the client program that they were previously
- using. Note that underlying this definition is an
- acknowledgment that there is a wide variety of
- information types and data-specific functions used for
- their access and display. As a result of this variety
- and in the interest of practicality, some data elements
- unique to a given information product may not be
- retrievable or displayable in an optimized form by all
- clients. Accessing and displaying these unique data
- elements efficiently may require a specific client that
- understands them. What is guaranteed by DXS
- compliance is not universal access to all types of data
- so much as usable access to all types of data. As long
- as the heart of an information product can be expressed
- using the DXS model, it will be practical to access
- that database with any DXS client. Therefore, when
- confronted with the need to access a database
- containing one or more unique data elements,
- researchers will have the opportunity to either access
- the bulk of the database with any DXS client or access
- the whole database with the database vendor's specific
- (DXS) client. Which choice they make will depend on
- their desire to access the database in its full glory
- weighed against their reluctance to learn to use a new
- client (and hence a new user interface).
-
-
- ORGANIZATION OF THE DXS DOCUMENT SET
-
- The Data Exchange Standard consists of a collection of
- four related documents:
-
- DXS Documents
-
- 1. CD-ROM Data Exchange Standard Overview and Glossary
-    (DXS/SPEC/L1, December 1991)
-
- 2. CD-ROM Data Exchange Standard Database Server Access
-    Protocol (DXS/DSAP/L1, December 1991)
-
- 3. CD-ROM Data Exchange Standard Client-Server Transfer
-    Syntax (DXS/CSTS/L1, December 1991)
-
- 4. CD-ROM Data Exchange Standard Platform Dependent
-    Implementation Details (DXS/PLAT/L1, December 1991)
-
- The first document, this one, introduces DXS concepts
- and describes the scope and functionality of DXS. The
- target audience is any individual interested in
- information retrieval issues. The next three documents
- are targeted for retrieval system designers and
- implementers. Familiarity with system design and
- programming issues will benefit readers of these
- documents. The Database Server Access Protocol is the
- heart of the DXS standard. The DSAP specifies the set
- of messages that clients and servers can pass to each
- other. The variety and functionality of these messages
- represent the richness of the DXS standard. The
- messages included specify how clients discover
- information about servers and databases, how queries
- are expressed and how results are returned. Included in
- the DSAP are specifications which accommodate
- extensibility to the message set. The Client-Server
- Transfer Syntax is a general purpose specification
- which describes the format of the messages passed
- between the client and the server. The specification
- includes features which will allow a variety of
- machine-to-machine communication protocols and program
- to program interprocess messaging strategies to be used
- as the conduit through which requests and responses are
- passed. The Platform Dependent Implementation Details
- documents how clients and servers are loaded, how they
- find each other, and how requests and responses are
- exchanged under various operating system environments.
- This version of DXS identifies solutions for both
- networked and non-networked environments including
- MS-DOS, Windows, Apple Macintosh, POSIX, TCP/IP, Novell
- IPX and NetBIOS.
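-
- A transfer syntax of this kind has to carry discrete
- messages over whatever conduit a platform offers, including
- byte streams such as TCP connections. One common technique
- is length-prefixed framing, sketched below in Python. This
- is an illustration of the general idea only; the 4-byte
- prefix and the function names are assumptions and are not
- taken from the DXS transfer syntax.
-
```python
import struct


def frame(payload: bytes) -> bytes:
    """Prefix a message with its length as a 4-byte big-endian integer."""
    return struct.pack(">I", len(payload)) + payload


def unframe(stream: bytes):
    """Split a byte stream into complete messages; return leftover bytes."""
    messages, pos = [], 0
    while len(stream) - pos >= 4:
        (size,) = struct.unpack_from(">I", stream, pos)
        if len(stream) - pos - 4 < size:
            break  # the next message has not fully arrived yet
        messages.append(stream[pos + 4:pos + 4 + size])
        pos += 4 + size
    return messages, stream[pos:]
```
-
- The receiver keeps any leftover bytes and prepends them to
- the next chunk it reads, so message boundaries survive even
- when the stream delivers data in arbitrary pieces.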
-
-
- DXS SCOPE
-
- The scope of any interface independent retrieval
- protocol helps define what type of information sources
- can be easily accommodated and what types of hardware
- platforms are possible as hosts for the resulting
- retrieval system. There is a dynamic between the
- richness of the protocol and the minimum hardware
- requirements that drives the hardware costs up as the
- number of included functions and data types grows. To
- support a larger variety of information types the
- number of functions must be expanded. A balance must be
- crafted that promises interoperability among the most
- popular variety of information sources on the most
- popular variety of platforms. Confounding this
- selection are three facts. Firstly, commercial products
- will not become available for some months or years
- after the original decisions are made. Secondly,
- popularity will continue to evolve as products emerge.
- Thirdly, the overhead of conforming to a standard and
- having two programs where there was once one suggests
- that non-standard products that exist today cannot
- duplicate their functionality and conform to a standard
- without raising the minimum hardware requirements. Any
- strategy which targets today's platforms but does not
- achieve a critical mass of products within three years
- is doomed to a short life. The DXS specification
- therefore carefully expresses its scope in the
- following paragraphs. The targeted scope balances a
- feature set that will accommodate a large variety of
- database functions and will fit well on computers which
- support a multi-tasking operating system. This may
- appear as a high-end system today but will become de
- rigueur as products emerge. The scope is expressed as a
- categorization of the information sources which would
- be accommodated, the platforms which could host the
- standard and the functional strategy selected.
-
- DATA MODEL
-
- This version of DXS supports read-only information
- products. The server is asked to query what the client
- considers a static information source. Updates to the
- data are outside the scope of DXS and may happen in any
- way the information vendor chooses. Information
- products which require client-originated updates may do
- so using vendor extensions. Note that this does not
- mean that DXS is limited to CD-ROM. The media type is
- not specified by DXS; any direct access device would be
- sufficient.
- All databases which are mostly textual or numeric in
- nature can be queried by DXS clients. Vendor extensions
- may be employed to return graphic data or spatially
- oriented data. The organization of the database
- controlled by the server uses the model of a set of
- records of arbitrary size. Optionally, within each
- record is one or more fields of arbitrary size. Within
- each field are one or more sentences. Within each
- sentence are one or more words. Words consist of one or
- more characters using editing rules at the server's
- discretion. There is support for a hierarchical
- organization of records which can be accessed through a
- table of contents. There is no formal schema. Servers
- assist clients by providing field information, index
- information and query evaluation strategies upon
- request by the client.
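-
- The record model described above can be pictured with a few
- Python data classes. This is a sketch under assumptions: the
- class names, the children attribute used for the hierarchy,
- and the naive rule for splitting sentences and words are all
- hypothetical, since DXS leaves editing rules to the server.
-
```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Field:
    name: str
    sentences: List[List[str]]  # each sentence is a list of words


@dataclass
class Record:
    fields: List[Field] = field(default_factory=list)
    children: List["Record"] = field(default_factory=list)  # hierarchy


def split_sentences(text):
    """Naive editing rule (an assumption): '.' ends a sentence, spaces split words."""
    return [s.split() for s in text.split(".") if s.strip()]


def word_count(record):
    """Count the words in a record and in every record beneath it."""
    own = sum(len(s) for f in record.fields for s in f.sentences)
    return own + sum(word_count(child) for child in record.children)


# A two-level hierarchy: a parent record with one child record.
chapter = Record(fields=[Field("title", split_sentences("Data exchange."))])
chapter.children.append(
    Record(fields=[Field("body", split_sentences("Load the disc. Run the query."))])
)
```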
-
-
- TARGET PLATFORMS
-
- Both standalone and networked computers are supported
- by DXS. A distinction is made between a non-networked
- server, called a local server, and a networked server.
- This is mainly because, in the former case, the
- database may be removable and the researcher may change
- it when he or she chooses, whereas in the latter case
- this is a supervisory function. There are also other
- technical considerations which merit a distinction. A
- multi-tasking operating system is not required but
- asynchronous operation of the client and server is
- assumed by DXS. This means that the client and server
- are, or appear to be, two independently functioning
- programs, whether they are running in the same machine
- or not. Therefore, implementation of a client and local
- server under a single-tasking operating system, such as
- MS-DOS, will be complicated by the need to emulate the
- asynchronous aspect of the protocol. No such problem
- applies to network clients or clients in a
- multi-tasking operating system. The asynchronous operation of
- the client and server is an important element of a
- flexible architecture.
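-
- One way to picture the emulation mentioned above is
- cooperative scheduling inside a single program: the server
- does a small unit of work each time it is polled, so the
- client appears to run alongside it. The Python sketch below
- uses a generator for this. It illustrates the idea only and
- is not taken from the DXS platform document; the function
- names and sample records are assumptions.
-
```python
def server_search(records, term):
    """Generator 'server': examines one record per step, then yields control."""
    hits = []
    for rid, text in records:
        if term in text:
            hits.append(rid)
        yield None  # hand control back to the client
    yield hits      # the final step delivers the result


def client(records, term):
    """Client loop: polls the server and stays free to do other work."""
    progress = 0
    for step in server_search(records, term):
        if step is None:
            progress += 1  # e.g. update a progress display or check for interrupt
        else:
            return step, progress


RECORDS = [(1, "optical disc"), (2, "magnetic tape"), (3, "disc drive")]
```
-
- Calling client(RECORDS, "disc") returns the hit list [1, 3]
- together with a count of three progress steps, one per
- record examined.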
-
-
- FUNCTIONAL MODEL
-
- DXS compliant products can take one of two forms. A
- client program can be created for a given operating
- system on a given computer. Or, a server program,
- coupled with one or more databases can be supplied for
- a given operating system or network on a given
- computer. Commercial products may typically provide
- several of both clients and servers, a matching set for
- each supported operating environment. Server programs
- are only responsible for understanding the proprietary
- file structures, data elements and index strategies
- used in the databases published by the server vendor.
- It is not required that a server program understand the
- file structures of databases from any other vendor.
- Client programs are responsible (amongst other things)
- for mapping end-user search requests into DXS-compliant
- queries. The client transmits these queries to the
- server using the mechanism specified for the operating
- system or network on which it is running. Since the
- transfer syntax lets servers choose the optimum format
- for returned data elements, clients must be prepared to
- accept any legal return type for the requested data.
- Clients can optionally support either networked
- servers, local servers or both.
- Since DXS benefits are intended to be end-user
- oriented, the client controls the session. In order to
- query a DXS database, the end user first starts his
- client program. It is then the client's responsibility
- to locate the available servers. For fully functioning
- clients which understand local and network servers,
- some servers will be available via the network, whilst
- local servers will be available via a table of
- installed servers. As the client discovers each
- available server, a list of database titles will be
- constructed resulting in a menu of information sources
- at the user's disposal. The client must be able to
- establish connections to network servers and load and
- unload the appropriate local servers as the user
- switches from one database to another.
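-
- The session flow above can be sketched in Python as follows.
- The structure of the installed-server table, the server and
- database names, and the Session class are hypothetical; the
- real table format is defined in the platform document.
-
```python
def build_menu(installed_table, network_servers):
    """Merge local and network servers into one menu of database titles."""
    menu = []
    for server in installed_table:
        for title in server["databases"]:
            menu.append((title, "local", server["name"]))
    for server in network_servers:
        for title in server["databases"]:
            menu.append((title, "network", server["name"]))
    return sorted(menu)


class Session:
    """Tracks which local server is loaded as the user switches databases."""

    def __init__(self, menu):
        self.menu = {title: (kind, name) for title, kind, name in menu}
        self.loaded = None
        self.log = []

    def select(self, title):
        kind, name = self.menu[title]
        if self.loaded and self.loaded != name:
            self.log.append(f"unload {self.loaded}")
            self.loaded = None
        if kind == "local" and self.loaded != name:
            self.log.append(f"load {name}")
            self.loaded = name
        elif kind == "network":
            self.log.append(f"connect {name}")
        return kind


LOCAL = [{"name": "srv_a", "databases": ["Database A"]}]
NETWORK = [{"name": "srv_b", "databases": ["Database B"]}]
```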
-
- DXS FEATURES
-
- Given that DXS is an agreement between two computer
- programs, DXS specifies the messages and their contents
- in a machine-friendly fashion. To simplify the creation
- and evolution of clients and servers, the protocol
- language does not require a grammar. To maintain the
- highest possible degree of portability and
- interoperability between different operating systems
- and networks, DXS does not use remote procedure calls
- or any other form of direct interprocess binding.
- DXS specifies a standard information file created
- during local server installation which simplifies the
- creation of menus and switching between local servers
- by clients. Servers accessed over networks can be
- easily located and functions are included that allow
- the client to discover the attributes of each server.
- The implementation details specify how local servers
- are loaded and unloaded as the user moves from one
- database to another. Included is support for removable
- media for local information sources. Login security is
- supported if required for a specific information
- source.
- Clients can discover the field names, capabilities and
- limitations of each database they query. Included among
- the available returned information are provisions for
- database specific end-user help text. This allows a
- client to display a semantic description of a database
- and its fields supplied by the server. Where supported
- by the server, indexes can be accessed conveniently and
- efficiently for display and browsing purposes.
- A full complement of Boolean operators is specified,
- complete with a variety of pattern matching
- expressions. Mechanisms exist to permit servers to
- support only a subset of these. Search progress and
- search interruption are also supported.
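-
- As a rough illustration of Boolean evaluation with a
- server-declared operator subset, consider the Python sketch
- below. The nested-tuple query form, the inverted index and
- the SUPPORTED set are assumptions for the example; they are
- not the DXS query representation.
-
```python
INDEX = {  # hypothetical inverted index: word -> set of record ids
    "disc": {1, 2, 3},
    "optical": {1, 3},
    "tape": {4},
}
ALL_RECORDS = {1, 2, 3, 4}
SUPPORTED = {"AND", "OR", "NOT"}  # a server could advertise a smaller set


def evaluate(query):
    """Evaluate a nested tuple query such as ("AND", "disc", ("NOT", "optical"))."""
    if isinstance(query, str):
        return INDEX.get(query, set())
    op, *args = query
    if op not in SUPPORTED:
        raise ValueError(f"operator {op} not supported by this server")
    sets = [evaluate(arg) for arg in args]
    if op == "AND":
        return set.intersection(*sets)
    if op == "OR":
        return set.union(*sets)
    return ALL_RECORDS - sets[0]  # NOT takes a single operand
```
-
- A client that knows which operators a server supports can
- avoid sending queries the server would reject.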
- Returned data elements may have flexible markup codes
- embedded by servers to allow enhanced display or
- printing by clients. Servers may sort retrieved records
- or rank them by relevance when so requested by a
- client. "Set hits" and "get hits" requests allow
- clients to build up sets of records selected by the
- user for future reference. Hierarchical information
- sources are supported through table of contents access
- features. Unfielded data is also supported for search,
- display and browse.
- Developers may freely add new vendor-unique messages to
- the protocol or new data elements to existing messages
- with no impact on other vendors' clients or servers.
- They do not forfeit conformance by doing so, but they
- limit the capabilities of their products when used in
- conjunction with other vendors' clients or servers.
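-
- A client can tolerate vendor-unique data elements simply by
- skipping the ones it does not recognize, as in this Python
- sketch. The element names, including the x-vendor-thumbnail
- extension, are hypothetical.
-
```python
KNOWN_ELEMENTS = {"title", "hits"}  # elements this client understands


def read_response(message):
    """Keep the elements the client knows; set aside vendor-unique ones."""
    known = {k: v for k, v in message.items() if k in KNOWN_ELEMENTS}
    ignored = sorted(k for k in message if k not in KNOWN_ELEMENTS)
    return known, ignored


response = {"title": "Database A", "hits": 12, "x-vendor-thumbnail": b"\x00\x01"}
```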
- The transfer syntax specified optimizes DXS for use in
- both networked and non-networked environments. Although
- DXS is a message-oriented protocol, the transfer syntax
- allows byte-stream-oriented communication strategies.
- Details are given that allow developers from different
- organizations to build compliant systems on specific
- platforms without private agreements. Developers that
- follow these guidelines have a much greater chance of
- achieving true interoperability and interface
- independence than if these elements were left
- unspecified. These are only some of the features in the
- DXS protocol, transfer syntax and implementation
- details. For more detailed descriptions of these
- features and others, please refer to the respective
- documents, available from Peter Ciuffetti,
- SilverPlatter Information, 100 River Ridge Drive,
- Norwood MA 02062-5026; 617/769-2599; fax 617/769-8763.
-
-
- CD-ROM STANDARDS IN THE AEROSPACE INDUSTRY
- PROGRESS AND PROMISE
-
- Don M. Goldman
- Pratt and Whitney, East Hartford CT
-
- Neil R. Shapiro
- Scilab Inc., Niskayuna NY
-
-
- Within the airline industry, standards for manufacturers'
- technical data are developed and maintained under the
- auspices of the Air Transport Association (ATA), in
- conjunction with the Aerospace Industries Association
- (AIA). ATA/AIA Committees meet on a regular basis to
- address standards for technical manuals, and to prepare
- for future requirements. The results are published and
- maintained in ATA Specification 100, which is updated
- once a year.
- Early in the 1980s, it became evident that the ATA
- would need to modernize Specification 100 to address
- emerging issues associated with electronic technology for
- access and delivery. Detailed planning for modernization
- was initiated in 1983 with organizational changes and
- initiatives targeted to meet anticipated requirements.
- Two key areas addressed included: 1) interchange of data
- in electronic form for reauthoring and 2) delivery of
- data in an electronic presentation format.
- Four subcommittees, under the coordinating committee
- ATA/AIA 89-9, are working to resolve standardization
- issues in these areas, as shown in Figure 1. The primary
- activities of the subcommittees are summarized below:
-
- [Figure 1: GOL01.PCX]
-
-
- 89-9A - Working on text interchange standards.
- SGML Document Type Definitions (DTDs) have
- been completed for Airframe Maintenance and
- Engine Manuals. DTDs for Airframe and Engine
- Illustrated Parts Catalogs are in their final
- draft. A coordination team is gathering
- requirements and establishing general
- guidelines for additional manuals.
-
- 89-9B - Working on graphics interchange
- standards. The subcommittee selected the
- industry-wide standard formats TIFF (raster) and CGM
- (vector), but needed to specialize them via an
- ATA/AIA application profile. The subcommittee
- is currently looking into ATA requirements for
- intelligent graphics (e.g., graphic to graphic
- cross-references).
-
- 89-9C - Working on requirements and standards
- for advanced retrieval systems for manuals.
- Current focus is on CD-ROM; in particular,
- standards to make CD-ROM publishing
- independent of end-user access tools.
-
- 89-9D - Developing functional requirements for
- networked information systems.
-
-
- CD-ROM STANDARDS
-
- This article reports on the progress of the 89-9C
- subcommittee in developing standards for delivery of data
- in an electronic presentation format on CD-ROM. It
- provides an update on the work reported here in June of
- 1989 (Shapiro, 1989).
- The work of the 89-9C committee is based on
- experience derived from field trials of CD-ROM
- technology. As early as the Spring of 1987, British
- Airways began testing CD-ROM as a new publishing medium
- for airframe maintenance manuals. During the past four
- years, American Airlines, CFMI (a joint company of
- Snecma, France and GE), GE Aircraft Engines, Pratt &
- Whitney, and Rolls Royce have also conducted field trials
- with CD-ROM.
- Although these trials proved that CD-ROM technology
- could be an effective tool for airlines to improve
- productivity, they identified a fundamental limitation in
- existing CD-ROM standards. The key problem is evidenced
- when an airline receives technical manuals on CD-ROM from
- more than one supplier (e.g., CFMI, GE, Pratt & Whitney,
- Rolls Royce). If the suppliers used different CD-ROM
- publishing assumptions (e.g., file format or data
- indexing technique), each CD-ROM would need to be
- supplied with its own retrieval software. Thus the
- airline could be faced with as many as four different
- user interfaces.
- The 89-9C Subcommittee's solution to this problem
- was to develop an open systems framework where the
- publishing of a disc was independent of the end-user
- access tools. The ATA refers to this separation as
- "Software Independent CD-ROMs". The framework divides the
- traditionally monolithic CD-ROM access software into a
- "server" and a "client". The "server" is delivered with
- each disc and provides, to other software programs, a
- standard method of access to the data on the disc. The
- "client" program acts as the interface between the user
- and the server (the user interface).
- This architecture allows airframe, engine, and
- component manufacturers to publish CD-ROMs with an
- optional user interface. The CD-ROMs, although ready for
- fully interactive retrieval, can be produced without any
- consideration of the end-user interface. An end-user
- company may work with a user interface supplied with the
- disc, or may select (or develop) a different user
- interface according to their requirements. This
- flexibility is accomplished by adhering to the ATA's
- suite of Advanced Retrieval Standards.
-
-
- CD-ROM STANDARDS SUITE
-
- The ATA Advanced Retrieval Standards are a set of
- documents in ATA Specification 100 Appendix 1, which
- provide a complete retrieval system specification (in
- Rev. 29, these documents were included in Appendix 1,
- Part 2, an informational section for standards in
- review). The framework is provided by 89-9C.CDROMPROFILE,
- which defines a standard application profile for CD-ROM
- systems, incorporating both broad and ATA specific
- standards (See Figure 2). It requires that the system is
- divided into a client and a server, and that
- communication between client and server follow the
- protocols and formats laid out in the following
- specifications:
-
- [Figure 2: GOL02.PCX]
-
-
- 89-9C.COMMWIN, "Communication Protocol for
- Microsoft Windows", defines the end-to-end
- communication protocol (transport service)
- between client and server under Microsoft
- Windows. This is the only specification in the
- set which is platform dependent. It is
- expected that protocols for other platforms
- (e.g., Unix and the Macintosh) will be added
- in 1992.
-
- 89-9C.SFQL2, "Structured Full-Text Query
- Language" is the standard interprocess request
- language between the client and server. SFQL
- provides both a standard method of data
- request (e.g., a query), and a standard method
- of data return. SFQL is based on the ANSI/ISO
- SQL standard for relational database access,
- with extensions for full-text and
- client/server operation (see Shapiro et al.,
- 1991).
-
- 89-9B.GRAPHICS, "Raster and Vector Formats",
- defines the format of illustration data
- returned from the server to the client
- (end-user application) software. It includes
- specializations of ANSI standard CGM (ANSI
- X3.122-1986) for vector images, and the de
- facto TIFF standard (Aldus Corporation) for
- raster images. The standard was authored by
- the 89-9B working group to support graphics
- interchange.
-
- 89-9A.DTD, "Document Type Definitions (SGML)"
- defines the format (logical markup) of text
- data as it is seen by the end-user application
- software. Note that since a logical markup is
- used, the client may display and manipulate
- text as it chooses. This standard was authored
- by the 89-9A working group to support text
- interchange.
-
- 89-9C.SCHEMAS, "Document Schemas" defines the
- mapping of manuals to the current database
- model (based on SFQL, an extension of the
- relational model).
-
- 89-9C.FUNCREQ, "Functional Requirements",
- defines the general aim of the standardization
- process by setting minimum functional
- requirements for advanced retrieval systems.
- This establishes a base level of functionality
- which must be supported by the standards and
- compliant systems.
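-
- Since SFQL is described above as SQL-based, the general
- shape of a request and its returned rows can be pictured
- with ordinary SQL. The Python sketch below uses the sqlite3
- module with a LIKE clause as a stand-in for text matching.
- It does not show SFQL syntax or its full-text extensions,
- and the table layout and sample rows are invented for the
- example.
-
```python
import sqlite3


def find_tasks(term):
    """Return (chapter, title) rows whose body text mentions the given term."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE manual (chapter TEXT, title TEXT, body TEXT)")
    db.executemany(
        "INSERT INTO manual VALUES (?, ?, ?)",
        [
            ("72", "Fan blade inspection", "Inspect each fan blade for cracks."),
            ("73", "Fuel filter", "Replace the fuel filter element."),
        ],
    )
    rows = db.execute(
        "SELECT chapter, title FROM manual WHERE body LIKE ? ORDER BY chapter",
        (f"%{term}%",),
    ).fetchall()
    db.close()
    return rows
```
-
- The client sends a query string and receives rows back in a
- predictable form, which is the property that lets one client
- work against servers built on different retrieval engines.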
-
-
- PROTOTYPE SYSTEMS
-
- Proof of concept prototypes have been developed by GE
- Aircraft Engines and Aerospatiale (the Toulouse France
- based airframe manufacturer) to demonstrate that software
- independence was possible.
- GE and Aerospatiale each selected a commercial
- text-retrieval system and built SFQL servers around them. GE
- based its server on the KnowledgeSet KRS system and
- Aerospatiale built its server around the Fulcrum
- Technologies Ful/Text system.
- GE and Aerospatiale each completed their system by
- building a client program which could access their own
- server via SFQL.
- Software independence was demonstrated when, at an
- ATA meeting in February, 1990, GE's client could access
- both the Aerospatiale (Fulcrum) and the GE (KRS) discs
- transparently. Likewise, Aerospatiale's client could
- access both discs.
-
-
- STANDARDS VALIDATION
-
- The proof-of-concept prototypes only demonstrated that
- part of the ATA 89-9C framework (SFQL and COMMWIN) was
- valid. It was not practical to develop a full working
- model to test all components of the framework. Further,
- while the prototypes were being developed, a committee of
- volunteers from the ATA/AIA and the text retrieval
- industry was working to generalize and enhance the SFQL
- specification.
- In order to validate the individual specifications
- prior to incorporation into Specification 100, the 89-9C
- subcommittee conducted a broad-based review. The specifications were
- sent to technical experts both in and outside of the
- aerospace industry, with a survey form to help focus the
- task.
- Based on the response to the survey, 89-9C went
- through another revision cycle for all the
- specifications. Several, including 89-9C.SFQL2,
- 89-9C.COMMWIN and 89-9B.GRAPHICS, were accepted for
- incorporation as standards within ATA Specification 100,
- Revision 30, which is expected to be published in early
- 1992.
- The remaining specifications required more extensive
- revision, and hence will remain as informational
- attachments to Specification 100, pending further
- validation efforts.
-
-
- NATIONAL & INTERNATIONAL ACCREDITATION
-
- The ATA/AIA has recognized that the CD-ROM standards they
- are developing are applicable to many other industries
- using CD-ROM. Further, standards such as these require a
- broad base of support. Thus, 89-9C has taken steps to
- communicate the ATA/AIA standards efforts outside the
- aerospace community and work with national and
- international committees to get accredited standards in
- place.
- At the national level, an ATA/AIA representative has
- been asked to participate on a NISO (National Information
- Standards Organization) committee which will be reviewing
- SFQL as a candidate for a retrieval system
- interoperability standard. Further, the IEEE Computer
- Society is reviewing a Project Authorization Request to
- establish an SFQL-based standard as part of a broader set
- of CD-ROM standards. Finally, at the international level,
- ISO (International Organization for Standardization) has requested
- ATA/AIA participation on a committee addressing full-text
- retrieval.
-
-
- LOOKING TO THE FUTURE
-
- In order to ensure the most effective application of
- technology, the ATA/AIA will continue to pursue the
- development and maintenance of standards. Following
- completion of the PC-based CD-ROM standards discussed
- earlier, other platforms for CD-ROM systems (i.e.,
- Macintosh and Unix) will be included in the CD-ROM
- profile. Beyond this, there is interest in developing
- standard application profiles for other media (i.e., WORM
- and Rewriteable).
- As the airline industry moves through the next
- decade, efforts to improve productivity by employing
- advanced retrieval technology will continue. British
- Airways and American Airlines have already installed
- production CD-ROM systems. Recently, GE Aircraft Engines
- and Boeing announced that they are moving forward with
- RFPs for standards compliant CD-ROM systems.
- Development and maintenance of standards for advanced
- retrieval technology has been and will continue to be a
- difficult and demanding endeavor. It will require hard
- work by dedicated and committed individuals. Thus far,
- the ATA/AIA has been up to the challenge.
-
- REFERENCES
-
- ATA (1990). ATA Specification 100, Appendix 1:
- Digital Data Standards, Published by the Air Transport
- Association, 1709 New York Avenue Northwest, Washington
- D.C. 20006.
- Shapiro, N.R., Diamantopoulos, E., and Cotton, P.
- (April, 1991). CD-ROM Disc Interchangeability Standards:
- Beyond ISO 9660 with the Structured Full-text Query
- Language. ATA/AIA 89-9C Monograph. (Available from
- Scilab.)
- Shapiro, N.R. (June, 1989). Electronic Document
- Delivery in the Aerospace Industry: Toward Standards.
- EPSIG News, Vol. 2, pp. 11-13.
-
-
-
- Reprinted from EPSIG News, with permission
- from the Electronic Publishing Special
- Interest Group (EPSIG) c/o OCLC, 6565 Frantz
- Road, Dublin OH 43017.
-
-
- STANDARD FOR THE EXCHANGE OF
- DIGITAL INFORMATION ON CD-ROM
- CD-ROM READ-ONLY DATA EXCHANGE STANDARD (CD-RDx)
-
- Commissioned by
- The Information Handling Committee
- Director of Central Intelligence
- Intelligence Community Staff
- Washington, DC 20505
-
-
- Questions concerning this standard should be directed to:
-
- Chairman
- DCI Intelligence Information Handling Committee
- Intelligence Community Staff
- Washington, D.C., 20505
-
- who will issue any interpretations or succeeding
- amendments or modifications, thereto, as may be required.
-
-
- 1. CURRENT STATE OF CD-ROM
-
- CD-ROM is the first efficient and economical
- publishing medium for large quantities of machine-
- readable data. The data on a CD-ROM cannot be added to,
- deleted, or changed. These facts and their implications
- have combined with the conditions of the microcomputing
- marketplace to create a situation that is
- unnecessarily complicated and inconvenient for both
- CD-ROM publishers and users. The specific sources of
- these complications and inconveniences are readily
- identifiable.
-
- 1.1 The Microcomputer Market
-
- Microcomputers are built principally around two
- things: the microchip for the CPU (with its related data-
- handling architecture) and the operating system.
- Currently, the installed base of microcomputers in North
- America is composed mainly of the following system types:
-
- 1. IBM PC/PS2 Standard Systems
- 2. Macintosh Systems
- 3. Super-Microcomputer Systems (e.g. Sun, DEC
- etc.)
- 4. Apple Systems
-
- A few other system types could be added, if one were to
- include special use systems and the world market.
- However, in terms of the current market for CD-ROM
- products, the systems listed above constitute virtually
- the entire universe.
-
- The operating systems which dominate the
- microcomputer market are equally few:
-
- 1. PC-DOS/MS-DOS
- 2. DOS under MS Windows
- 3. Macintosh OS
- 4. UNIX/XENIX
- 5. OS/2
- 6. Apple OS
-
- Again, there are a few others, but they occupy either
- specialty niches or a negligible share of the current
- market.
-
- 1.2 CD-ROM Drives and Drivers
-
- There are several dozen manufacturers of CD-ROM
- drives competing for the current market. These
- manufacturers have a special problem. A microcomputer
- user cannot simply buy a CD-ROM drive and attach it to
- his or her machine; special device drivers must also be
- purchased and installed. Those six operating systems
- listed above each need to have special device drivers
- added to them in order to allow a microcomputer to access
- a CD-ROM drive. So each manufacturer currently has to
- choose those segments of the market with which to make
- their drives compatible. Most drive manufacturers have
- created separate device drivers for the main operating
- systems in the microcomputer market. However, things are
- not as chaotic as they could be because certain critical
- CD-ROM standards have already been implemented.
-
- 1.3 Current CD-ROM Standards
-
- Early in the development of CD-ROM, standards for
- defining logical blocks of data on a CD-ROM were
- established (ISO/IEC Standard 10149). This standard made
- it theoretically possible for each CD-ROM drive complying
- with this standard to read data from discs complying with
- this standard.
- The next advance in CD-ROM standards was a two-step
- process to standardize the format of the data on a
- CD-ROM. This began with the "High Sierra" proposed
- standard and developed into the International Standard
- for the CD-ROM Volume and File Format (ISO 9660). This
- standard enabled the same discs to be accessed on
- different brands of CD-ROM drives on different
- microcomputers, provided that these microcomputers were
- operating under the same operating system. This is where
- the CD-ROM industry is today, and this last proviso still
- constitutes a serious impediment to the ultimate goal of
- both publishers and users of CD-ROMs, namely
- "interoperability."
-
- 1.4 Interoperability
-
- The concept of "interoperability" is simple.
- Interoperability is the desire to be able to purchase any
- CD-ROM title and be able to access it on any CD-ROM
- drive, using any microcomputer system, operating under
- any operating system. Interoperability would enable
- publishers to master one disc for all markets.
- Interoperability would enable any user to buy any disc
- with the assurance that it will operate on his or her
- microcomputer system.
- However, the above definition of interoperability
- does not satisfy all the problems currently being
- experienced by CD-ROM publishers and users. Another very
- serious problem remains: the lack of a common interface by
- which a user can access the data on any CD-ROM.
-
- 1.5 The Proprietary World of CD-ROM User Interfaces
-
- For a CD-ROM title to be useful, the user must be
- able to access the data on it. If every user bought only
- one CD-ROM title, this would be a trivial problem. The
- user would learn how to operate the data retrieval system
- provided by the publisher of that one title. However,
- CD-ROM is a mass publishing medium. Therefore, it is
- likely that most CD-ROM users will access more than one
- CD-ROM title. Therein lies the problem.
- Unlike the situation for microcomputer systems and
- operating systems, there are hundreds of different data
- storage and retrieval systems being used on CD-ROM
- titles. So instead of having to learn just one retrieval
- system, multi-title users are forced to learn many, in
- fact, one per title in many instances. It is either an
- inconvenience, a nuisance, or an impossibility for users
- to be forced to master many different retrieval systems.
- What each user wants is to be able to access the data on
- a CD-ROM using one accessing mechanism (i.e., the user
- interface) of his or her choice.
- Although there are many different data storage and
- user interface systems in use, the retrieval components
- of these systems fall into about a half-dozen different
- generic categories. These are:
-
- 1. Command Driven Interfaces (e.g., SQL)
- 2. Menu Interfaces (e.g., Lotus 1-2-3)
- 3. Function Key Interfaces (e.g., WordPerfect)
- 4. Graphic User Interfaces (e.g., Macintosh OS,
- MS Windows)
- 5. Point-and-Shoot & Hypertext
-
- There are a few other types of interfaces, such as
- touch-screen and voice, but they are not nearly as
- pervasive as the five listed above, at least for
- accessing data on CD-ROM. Moreover, there is no clear
- leader for any of the above five types of interfaces.
- Surveys have shown that interface types 1 through 4 hold
- roughly equivalent shares of user preference. Thus, no
- one type of interface is likely to prevail based on
- current user preferences. However, the problem is not
- intractable.
-
- 1.6 Complete Interoperability
-
- The goal of total interoperability requires an
- expanded definition from that given in Section 1.4 above.
- The definition of "interoperability" should be expanded
- as follows:
-
- Interoperability is the ability to purchase any CD-ROM
- title and be able to access it on any CD-ROM drive, using
- any microcomputer system, operating under any operating
- system, using any data retrieval interface.
-
- In order to achieve the desired state of
- interoperability, as defined above, it is necessary to
- promulgate and implement a third standard for CD-ROM.
- This proposed standard is the subject of this document.
-
- 2. CD-ROM READ-ONLY DATA EXCHANGE STANDARD (CD-RDx)
-
- The CD-RDx standard is commissioned by the Director
- of Central Intelligence of the U.S. for the immediate
- benefit of the U.S. Federal Government and any other
- entities that wish to adopt CD-RDx. The U.S. Government
- is the largest single procurer of microcomputing
- equipment, products, and services in the world. As such,
- the goal of total interoperability for CD-ROMs is
- exceedingly important to the U.S. Government. The
- significant savings in publishing, shipping, and storage
- costs that are associated with CD-ROM can only be
- realized if end users can use the data on them simply,
- efficiently, and inexpensively.
-
- 2.1 Purposes of CD-RDx
-
- Based on the above line of reasoning, the Director
- of Central Intelligence and the Intelligence Community
- Staff are in the process of developing this standard for
- the following purposes:
-
- 1. Foster an environment in which access to data
- on CD-ROM discs would be both:
-
- Systems independent (i.e., functionally
- interoperable across systems)
-
- Software independent (i.e., functionally
- interoperable with any user interface)
-
- 2. Provide a standard with sufficient level of
- detail to guide implementation and minimize
- ambiguity in the production of interoperable
- CD-ROMs.
-
- 3. Provide a standard that is fully compliant, at
- a minimum, with "ISO/IEC Standard 10149" which
- defines the logical blocks of data on a
- compact disc, the "International Standard for
- the CD-ROM Volume and File Format" (ISO 9660),
- the "Standard Generalized Markup Language"
- (SGML) (MIL-STD 28001), the "Open Systems
- Interconnection" (OSI), the "Government Open
- Systems Interconnection Profile" (GOSIP) and
- POSIX (FIPS 151).
-
- 4. Promote long-term storage and exchange of
- information on CD-ROM between and within the
- agencies of the U.S. Government.
-
- 5. Provide a standard that would promote adoption
- of CD-ROM as an acceptable archiving medium.
-
- Regarding this last stated purpose, because CD-ROM data
- is in machine-readable form, easy transfer to any
- subsequent archiving media is assured. For these reasons,
- CD-ROM is considered an appropriate medium for the
- dissemination and archiving of digital data, replacing
- and/or supplementing paper, microfiche, 9-track tapes and
- other digital mass storage devices.
-
- 2.2 The Scope of CD-RDx
-
- At this time, the CD-RDx standard is concentrating
- on implementing only those microcomputing platforms and
- operating systems that are widely used in the U.S.
- Government microcomputing environment. These are as
- follows:
-
- Microcomputing Platforms        Operating Systems
-
- IBM PC/PS2 Standard Systems     PC/MS-DOS, DOS under Windows
- Macintosh Systems               Macintosh OS
- Super-Microcomputer Systems     UNIX/XENIX
- IBM PS2/80 Systems              OS/2
-
-
- The DOS operating system is by far the most common
- operating system encountered in the government
- environment. DOS also presents the most difficult
- challenge for CD-RDx implementation, because of the 640K
- RAM barrier. Because of the exceedingly large base of
- microcomputers running under DOS without any extended
- memory or disk caching capabilities, the goal of CD-RDx
- implementation is to make interoperability functional
- within the 640K limitation for DOS systems. The other
- operating systems listed above are not nearly as
- constrained as DOS, and should prove relatively easy
- environments in which to implement CD-RDx.
-
- 2.3 The CD-RDx Approach to Data Indexing and Retrieval
-
- Traditionally, the indexing of data into a database
- and the retrieval of data from that database have been
- accomplished by a single database management system
- (DBMS). These DBMSs are characterized by a wide range of
- indexing algorithms and a wide variety of data retrieval
- techniques. Frequently, these DBMSs are highly integrated
- programs, with the programming designed to accomplish all
- the DBMS functions interlaced to a high degree. For
- dynamic databases (i.e., databases where data is being
- added, changed, or deleted regularly), these interlaced
- DBMSs are still most appropriate.
- However, a CD-ROM does not allow the data stored on
- it to be altered in any way. Therefore, many of the
- subroutines and index structures developed for managing
- dynamic databases are of no utility for CD-ROMs. Since no
- database update is possible, the indexes for a CD-ROM can
- be significantly more compact than the indexes for a
- dynamic database. Moreover, since the end user of a
- CD-ROM cannot index any data on it, there is no need to
- include the indexing subroutines of the DBMS on the
- CD-ROM when it is published. Only the data retrieval
- portion of the DBMS is necessary for the end user.
- These characteristics of CD-ROM databases make it
- possible and, indeed, desirable to treat the indexing of
- data and the retrieval of data as two separate and
- distinct functions. By thus separating indexing from
- retrieval, the goal of interoperability is brought within
- reach. The CD-RDx standard is based on this separation of
- indexing from retrieval functions, but takes the
- separation of functions to the next level, the separation
- of user interface from retrieval.
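-
- The split between indexing and retrieval can be sketched as
- follows. This is only an illustration under stated
- assumptions: the function names, the word-level index and the
- sample records are hypothetical and are not taken from the
- CD-RDx standard.
-

```python
from bisect import bisect_left


def build_index(records):
    """Publisher-side step: runs once, before the disc is mastered."""
    postings = {}
    for record_id, text in enumerate(records):
        for word in set(text.lower().split()):
            postings.setdefault(word, []).append(record_id)
    # The disc can never change, so the index is frozen into sorted,
    # tightly packed tuples with no slack reserved for later inserts.
    terms = tuple(sorted(postings))
    ids = tuple(tuple(postings[term]) for term in terms)
    return terms, ids


def lookup(index, word):
    """User-side step: the only part that has to ship on the disc."""
    terms, ids = index
    key = word.lower()
    pos = bisect_left(terms, key)
    if pos < len(terms) and terms[pos] == key:
        return ids[pos]
    return ()


# Hypothetical sample records, used only for illustration.
records = [
    "CD-ROM drives and drivers",
    "ISO 9660 volume format",
    "CD-ROM volume and file format",
]
index = build_index(records)
print(lookup(index, "format"))
```

-
- Only lookup() and the frozen index would need to reach the
- end user; build_index() stays with the publisher.
-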
- The separation of user interface from retrieval is
- reflected in CD-RDx by the development of two concepts:
-
- 1. The Client- which is responsible for all
- communication with the user and
- management of the display
- monitor. The Client can be
- logically separated into three
- components: the user interface,
- which manages the screen and
- the user input; the message
- generator, which prepares and
- sends messages to the Server;
- and the data reformatter, which
- takes the data returned by the
- Server and prepares it for the
- user interface. In this
- standard, the term "Client" is
- used to refer to all three
- components.
-
- 2. The Server- which is responsible for all
- the interactions involving
- accessing the data on a CD-ROM.
- The Server consists of two
- basic sections: a non-
- proprietary section, which
- handles three functions
- (program initiation and
- removal, memory management, and
- the parsing of messages from or
- to the Client) and a
- proprietary section, which
- contains the Application
- Program Interface (API) and the
- retrieval functions. The Server
- must also be able to
- communicate directly with the
- operator and the operating
- system when it is being
- initialized as a separate TSR
- program (see Section 3.2.1,
- below).
-
- There may be different Clients and different Servers. The
- Client may or may not be contained on the CD-ROM; all
- Servers must be located on the CD-ROM. Each Server must
- be accessible via a name unique to each operating system.
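-
- The three logical components of the Client described above
- might be laid out as in the sketch below. The "FIND" command,
- the function names and the fixed table returned by the
- stand-in Server are all assumptions made for illustration;
- the actual CD-RDx command set is not reproduced here.
-

```python
def message_generator(user_input):
    """Prepare an ASCII message for the Server (hypothetical FIND form)."""
    return "FIND " + " ".join(user_input.split())


def data_reformatter(table):
    """Turn the Server's 2-dimensional table into lines for display."""
    return [" | ".join(str(cell) for cell in row) for row in table]


def user_interface(user_input, send_to_server):
    """Manage input and output; everything else is delegated."""
    message = message_generator(user_input)
    table = send_to_server(message)
    return data_reformatter(table)


def fake_server(message):
    """Stand-in for a Server on the disc; always returns one fixed table."""
    return [["title", "year"], ["Example Title", 1991]]


for line in user_interface("  example   title ", fake_server):
    print(line)
```

-
- Because the user interface sees only message_generator() and
- data_reformatter(), a different front end could be swapped in
- without touching the Server.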
-
- 2.4 Sequence of Events of a CD-RDx Query
-
- With the CD-RDx standard, a CD-ROM would contain the
- user data, one or more data indexes, and one or more
- database Servers. Data access and retrieval occurs in the
- following manner:
-
- 1. The user enters a data request according to
- the Client program of the user's choice.
-
- 2. The Client passes a data request to a program
- called a "Server," using the commands and
- protocols defined in the CD-RDx standard.
-
- 3. The Server parses the data request and passes
- that request to its retrieval function
- portion.
-
- 4. The retrieval functions perform the following:
-
- Receive the data request from the Server
-
- Process the data request against the data
- index(es)
-
- Determine whether or not data on the
- CD-ROM satisfies the data request
-
- Identify the appropriate response to
- the data request
-
- 5. The Server passes the data to the Client, using
- standard commands and protocols defined in the
- CD-RDx standard.
-
- 6. The Client presents the response data to the
- user.
-
- Although a user interface developed specifically for the
- database/retrieval engine may also accompany the CD-ROM
- application, the CD-RDx standard stipulates that any
- Client conforming to the CD-RDx standard must be able to
- access data on a CD-ROM, as described above. In this way,
- the user may select the Client of his or her choice and
- access data on any CD-ROM indexed by an application
- conforming to the CD-RDx standard. The publisher who
- publishes CD-RDx compatible CD-ROMs is assured that such
- CD-ROMs are operable within all operating systems for
- which there are Servers on the CD-ROM and that the data
- on the CD-ROM is accessible to all users who have any
- CD-RDx compliant Client.
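-
- Steps 3 through 5 of the sequence above, seen from the Server
- side, can be sketched as follows. The command verb, the index
- contents and the layout of the returned table are hypothetical
- placeholders, not definitions from the CD-RDx standard.
-

```python
# Hypothetical index and data, standing in for what is on the disc.
INDEX = {"engine": [3, 7], "wing": [7]}
DATA = {3: "Engine manual", 7: "Wing and engine inspection"}


def parse_request(message):
    """Step 3: split the ASCII message into a verb and its search terms."""
    verb, _, argument = message.strip().partition(" ")
    return verb.upper(), argument.lower().split()


def retrieve(terms):
    """Step 4: match the request against the index (all terms must hit)."""
    hits = None
    for term in terms:
        ids = set(INDEX.get(term, []))
        hits = ids if hits is None else hits & ids
    return sorted(hits or [])


def serve(message):
    """Steps 3 to 5: parse, retrieve, and hand back a 2-dimensional table."""
    verb, terms = parse_request(message)
    if verb != "FIND":
        return [["status"], ["unknown command"]]
    rows = [[record_id, DATA[record_id]] for record_id in retrieve(terms)]
    return [["id", "text"]] + rows


print(serve("find engine wing"))
```

-
- In this sketch parse_request() corresponds to the
- non-proprietary part of the Server, while retrieve() stands in
- for the vendor's proprietary retrieval functions.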
-
- 2.5 How CD-RDx Works
-
- The CD-RDx standard defines a set of protocols
- enabling the transfer of CD-ROM data between any CD-RDx
- Client and any CD-RDx Server. The standard consists of a
- set of commands, 2-dimensional tables and a simple
- procedure for their use. CD-RDx replaces the need for
- users to learn a new interface for each CD-ROM
- application. CD-RDx requires that software developers
- construct database Servers for their proprietary
- retrieval engines. CD-RDx allows CD-ROM software
- developers, publishers, or third-party developers to
- create new user interfaces or modify existing ones
- without having to develop a complete DBMS.
- The CD-RDx standard assumes the following for each
- and every CD-ROM produced and used:
-
- Digital data resides on CD-ROMs indexed using
- a proprietary indexing program.
-
- Each CD-ROM database Server is available on
- the CD-ROM for loading into memory on the
- user's system.
-
- Different types of Clients can be developed to
- access and manipulate the digital data
- returned by the Server.
-
- Every Client that implements CD-RDx can access
- any CD-ROM database that implements CD-RDx via
- a database Server.
-
- The CD-RDx Protocol accepts commands and
- responds appropriately without additional user
- or Client involvement.
-
- The database indexing programs can be improved
- and additions made without affecting the
- Client programs. Conversely, the Client
- programs can be improved and additions made
- without affecting the database indexing
- programs. (This concept is known as "backward
- compatibility," wherein the earlier versions
- of programs still work with new versions.)
-
- The CD-RDx Protocol/database combination is a
- "black box" approach used extensively
- throughout the computer industry and is
- therefore known and accepted by software
- developers.
-
- The CD-RDx standard is designed to be run-time
- compatible with the Client.
-
- Communication between the database Server and
- the Client occurs using English-like commands
- in ASCII text format.
-
- The data accessed and retrieved from a CD-ROM
- complying with the CD-RDx standard may be
- alphanumeric, audio, digital graphics, vector
- images, or video. The ability to access and
- display each or all of these formats is
- dependent on the capabilities of the Client.
- The ability to retrieve each or all of these
- formats is dependent on the capabilities of
- the Server and/or the indexing system, as well
- as the database design choices made by the
- publisher before the CD-ROM was produced.
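-
- The last point, that what a user can see depends on the
- capabilities of the Client, can be illustrated with a small
- sketch. The type labels and the tagged-tuple response format
- are assumptions chosen for the example; the standard's own
- encoding of data types is not shown in this document.
-

```python
def split_by_capability(results, client_supports):
    """Separate what this Client can display from what it must skip."""
    shown, skipped = [], []
    for data_type, payload in results:
        if data_type in client_supports:
            shown.append((data_type, payload))
        else:
            skipped.append(data_type)
    return shown, skipped


# Hypothetical Server response: each item is tagged with its data type.
results = [
    ("alphanumeric", "Torque values for panel 12"),
    ("graphics", b"\x00\x01\x02"),
    ("x-vendor-overlay", b"\x10\x11"),  # a proprietary type
]

text_only_client = {"alphanumeric"}
shown, skipped = split_by_capability(results, text_only_client)
print(shown)
print(skipped)
```

-
- A Client with graphics support would pass a larger set and
- receive more of the same response; the Server side is
- unchanged.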
-
-
- APPENDIX: QUESTIONS AND ANSWERS
-
- Q. What does CD-RDx involve?
- A. Like many other computer operations,
- the CD-RDx standard incorporates a
- Client/Server approach, meaning that the
- Client (what the user interacts with,
- such as the menus and/or commands for a
- particular software application, command
- structure, graphical user interface or
- pull down menu) is separate from the
- Server (the unique retrieval engine and
- data index requirements). Therefore,
- CD-RDx describes in detail the Black Box
- through which both sides are to
- communicate with one another.
-
- Q. What's in the Black Box?
- A. The Black Box is not a thing; rather
- it is a set of rules that provide a level
- of abstraction so that both sides -- the
- Client and the Server -- can communicate
- with one another. These rules consist of
- commands and 2-dimensional tables as
- described in detail in the CD-RDx
- standard.
-
- Q. What will an organization need to do
- to adhere to this standard?
- A. If an organization is purchasing
- commercially available CD-ROM
- applications or developing a solicitation
- for the delivery of digital data on
- CD-ROM discs, then an explicit statement
- requiring that the discs contain the
- necessary index/retrieval engine drivers
- (Server) conforming to this standard is
- needed.
- If, on the other hand, an
- organization is receiving CD-ROM discs
- with the Server, then the organization
- must have the necessary user interface
- drivers (Client) conforming to the
- standard in order to use the discs.
-
- Q. Does this mean that an organization
- has to create a new Client for each
- Server?
- A. Absolutely NOT. In fact, quite the
- opposite. Once a Client is created for a
- top-level user interface, such as Windows
- 3.0, Motif, Hypercard, or a software
- application program, such as SAS, SPSS,
- Lotus 1-2-3, Excel, Word, WordPerfect and
- so on, then these types of user
- interfaces will work with ANY CD-ROM disc
- that has the requisite Server.
-
- Q. What will happen to existing CD-ROM
- discs that do not contain a Server?
- A. These discs continue to be useful just
- as they are now; however, unlike those
- discs containing a Server, the existing
- discs will have to be installed with each
- use and the user interfaces will be
- specific to that disc.
-
- Q. Can a user interface now used for a
- specific CD-ROM retrieval engine be used
- for ALL discs containing a Server?
- A. Yes, if the software vendor has
- designed the retrieval engine and user
- interface as separate functions and if
- the user interface has the necessary
- Client drivers.
-
- Q. What are the major implications of the
- CD-RDx standard?
- A. Presently, most CD-ROM publishers
- purchase CD-ROM software based on the
- user interface and the features that the
- user interface provides. With this
- standard, potential publishers will not
- have to concern themselves about a user
- interface, since these decisions will
- reside with users and the user
- interfaces/Client the users select.
- Instead, publishers will be dealing with
- performance of the software, the
- structure of the index(es) and data
- preparation requirements of the software
- program. Although this is exactly what
- potential CD-ROM publishers should be
- concerned with, most software vendors
- would like potential licensees to
- consider less important factors, such as
- sticky notes, printout capabilities,
- Boolean operators and licensing fees.
-
- For CD-ROM publishers, this standard could also move
- the perspective of potential buyers from, again, features
- as observed in the user interface to quality, accuracy
- and completeness of the data. Like a book, CD-ROM should
- convey appropriate, complete and useful information. A
- potential purchaser is now wise enough to detect a
- beautifully bound book with irrelevant information. This
- "next-level-of-intelligence" will occur for CD-ROM
- purchasers and users as well, as a result of this
- standard.
-
- Q. What are the risks? What will users or
- purchasers of CD-ROMs have to give up?
- A. Nothing. The standard provides only
- rewards for users and purchasers. Instead
- of having to learn each software program
- with each new disc, users will select the
- user interface of their choice and
- access data on any one or a combination
- of discs with this single user interface.
- Users may develop a single search and
- then apply it to one or more databases on
- separate discs published by different
- vendors.
-
- Q. What will CD-ROM producers have to do
- to implement this standard?
- A. Assuming that the specific CD-ROM
- indexing and retrieval software selected
- comes with a Server, then the answer is
- nothing different than what is required
- of the program now. The software
- developer/vendor, however, will have to
- create its own Server driver ONCE for
- each OS.
-
- Q. What will CD-ROM users have to do to
- implement this standard?
- A. Obtain or develop a Client driver for
- each top-level or application-specific
- user interface as desired. For example,
- the user may want to access full-text
- information from Windows 3.0, from a word
- processing program, and also from "XYZ's"
- menu. Client drivers will have to be
- developed for each of these three
- programs.
-
- Q. What will the CD-ROM industry have to
- do to implement this standard?
- A. Developers of CD-ROM indexing and
- retrieval software will each have to
- develop their own Server drivers to work
- with their proprietary software engines
- and indexing schemes. Software and/or
- third-party developers of top-level
- interfaces and of software applications
- will have to develop Client drivers to
- work with a number of programs.
-
- Q. Does this mean that a user interface
- does not come automatically with a
- CD-ROM application?
- A. Not necessarily. A CD-ROM
- producer/publisher may decide to deliver
- a user interface for the database as well
- as the Server on the same disc. The
- standard does not prevent publishers or
- users from providing or using a user
- interface that comes with the software as
- we are used to at the present time.
-
- Q. What is the proposed course of action
- with respect to implementing the CD-RDx
- standard as an international standard?
- A. The DCI Intelligence Information
- Handling Committee (IHC) agrees on the
- need for a CD-RDx standard for use
- throughout the Intelligence Community.
- Simultaneously, IHC is briefing civilian
- agencies, the Department of Defense,
- NISO/NIST and other groups in the public
- and private sectors on the details of the
- CD-RDx standard. It has also begun funding
- the development of Client drivers for
- some "most-used" user interfaces in the
- DOS, UNIX, OS/2 and Macintosh operating
- system environments so that CD-ROM discs
- containing the requisite Server driver
- can be demonstrated to work in different
- operating systems within different user
- interfaces.
-
- Q. Does this CD-RDx standard conflict
- with any of the standardization
- activities within the Computer-Aided
- Acquisition and Logistics Support (CALS)
- system?
- A. The answer is NO.
-
- Q. How does the CD-RDx standard address
- the accessing, retrieval and display of
- graphics?
- A. Images are passed as data to the
- Client. It is up to the Client to support
- a specific type of graphics. In short,
- CD-RDx in no way limits the use of
- graphics, nor the type of graphics.
-
- Q. Are the specifications and the
- internal components of the specifications
- extensible?
- A. Yes. Extensibility is in the data
- display. The data return types include
- extensible/proprietary data types. NOTE:
- Proprietary data types can only be used
- by Clients that understand that type of
- data (i.e., the Server company's
- Client).
-
- Q. How will multiple simultaneous
- protocols be defined? Where will the
- protocol handlers be located? Does CD-RDx
- take into account multi-relational,
- multi-file databases?
- A. Implementation of the Server is not
- limited to the CD-RDx standard. If a
- particular CD-ROM retrieval engine
- supports multi-anything, then that
- particular software developer should
- write these capabilities into the Server.
-
- Q. Doesn't a memory resident program,
- such as a Terminate and Stay Resident
- (TSR), limit acceptance of the CD-RDx
- standard?
- A. The primary purpose of CD-RDx is to
- transcend the need for writing a Server
- for each computer or interface
- environment. Without a memory resident
- Server, it would not be a standard
- acceptable in all environments.
-
- Q. Is there any reason why the Database
- Server must be a TSR? There are any
- number of alternative ways to implement
- such a Server, some of which have
- significant advantages over a TSR based
- implementation.
- A. A Terminate and Stay Resident (TSR)
- program may be used, but it is only
- needed in the DOS part of the CD-RDx
- standard. The Server just needs to be an
- independent program in memory.
-
- Q. Where in this CD-RDx standard are
- security issues addressed?
- A. If security and audits are an issue,
- then they may be added to the Server.
- Such functional capabilities are part of
- the Server.
-
- Q. The limits for PATH and FileName in
- the POSIX specification are more liberal
- than those for MS-DOS. Can the length
- limitations be increased?
- A. Clients operating in different
- environments can specify any length for
- path and file names.
-
- Q. How will the user process
- records/fields that are huge, such as 1
- MB or larger?
- A. CD-RDx imposes no limitations at all.
- The Client sets buffer length. The
- Server merely writes to the buffer. The
- buffer may be written to the full
- available size of any storage device
- available. This allows the Server to use
- both RAM and mass storage.
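-
- One way to picture this arrangement is the sketch below, in
- which the Client chooses the buffer length and the Server
- simply fills it until the record is exhausted. The chunked
- loop, the function names and the 64 KB buffer are assumptions
- for illustration, not the mechanism defined by CD-RDx.
-

```python
import io


def server_write(record, buffer, offset):
    """Copy as much of the record as fits into the Client's buffer."""
    chunk = record[offset:offset + len(buffer)]
    buffer[:len(chunk)] = chunk
    return len(chunk)


def client_read(record, buffer_length, sink):
    """Drain a record of any size through a buffer the Client sized."""
    buffer = bytearray(buffer_length)
    offset = 0
    while True:
        written = server_write(record, buffer, offset)
        if written == 0:
            break
        sink.write(buffer[:written])  # sink may be RAM or a file on disk
        offset += written
    return offset


record = b"x" * 1_500_000  # a field larger than 1 MB
sink = io.BytesIO()
total = client_read(record, 64 * 1024, sink)
print(total)
```

-
- Replacing the in-memory sink with a file opened on a hard
- disk would let the same loop spill a large field to mass
- storage.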
-
- Q. How does CD-RDx work with networks?
- A. The ability to access databases that
- reside on a networked drive is a function
- of the Server. Establishing the link to
- the network is not in the realm of how
- data is passed between the Client and
- Server. When the database retrieval power
- is on a network server, it is the
- responsibility of a "Server Stub" running
- on a local computer to request and
- process the remote information. The
- possible exception to this is in the UNIX
- world where the remote Server has
- established an ID on the CD-RDx Socket.
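-
- The "Server Stub" idea can be sketched with a local socket
- pair standing in for the network. Everything here is a
- hypothetical illustration: the one-line text protocol, the
- "OK" reply and the class name are not part of the CD-RDx
- standard.
-

```python
import socket
import threading


def remote_server(conn):
    """Stand-in for the retrieval engine running on the network server."""
    with conn, conn.makefile("r", encoding="ascii") as reader:
        request = reader.readline().strip()
        conn.sendall(("OK " + request + "\n").encode("ascii"))


class ServerStub:
    """Runs on the local computer and looks like a Server to the Client."""

    def __init__(self, conn):
        self.conn = conn
        self.reader = conn.makefile("r", encoding="ascii")

    def handle(self, message):
        """Forward one request and return the remote reply."""
        self.conn.sendall((message + "\n").encode("ascii"))
        return self.reader.readline().strip()

    def close(self):
        self.reader.close()
        self.conn.close()


# A connected socket pair replaces a real network link in this sketch.
local_end, remote_end = socket.socketpair()
worker = threading.Thread(target=remote_server, args=(remote_end,))
worker.start()

stub = ServerStub(local_end)
answer = stub.handle("FIND engine")
worker.join()
stub.close()
print(answer)
```

-
- The Client calls handle() exactly as it would call a local
- Server, so the network link stays outside the Client/Server
- message exchange.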
-
- Q. ISO 9660 has three levels of
- implementation. Does CD-RDx have a
- similar approach to implementation so
- that developers can get "up and running"
- faster?
- A. CD-RDx supports everything and nothing
- can be left out. However, developers of
- Clients and Servers can certainly begin
- with the basics and provide increasing
- levels of sophistication over time. The
- basic data types must be supported if the
- Server has that data available in any of
- the databases it is responsible for.
-
-
-
- SFQL: STRUCTURED FULL-TEXT QUERY LANGUAGE
-
- Dr Neil R. Shapiro
- Scilab Inc., Niskayuna NY
-
-
- This paper was presented as the overhead visuals listed below.
-
- %g SHAP01.PCX
- %g SHAP02.PCX
- %g SHAP03.PCX
- %g SHAP04.PCX
- %g SHAP05.PCX
- %g SHAP06.PCX
- %g SHAP07.PCX
- %g SHAP08.PCX
- %g SHAP09.PCX
- %g SHAP10.PCX
- %g SHAP11.PCX
- %g SHAP12.PCX
- %g SHAP13.PCX
- %g SHAP14.PCX
- %g SHAP15.PCX
- %g SHAP16.PCX
-
-
-
-
- THE CD-ROM STANDARDS FRONTIER: ROCK RIDGE
-
- Andrew Young
- President, Young Minds Inc.
-
-
- The CD-ROM industry is growing by leaps and bounds. In
- 1990, revenues exceeded the one billion dollar mark and
- the installed base surpassed one million drives. The
- advantages offered by this low-cost, high-capacity medium
- are motivating leading-edge companies in all segments of
- the desktop computing industry to utilize CD-ROM for
- their information and data-intensive applications, as
- well as enabling the establishment of entirely new
- categories of software and information publishing.
- Different types of applications are driving the
- proliferation of CD-ROM on different computing platforms.
- Self-contained reference, educational and entertainment
- applications account for the majority of CD-ROM
- publications for PC compatibles. Hypercard stacks,
- collections of fonts, images or clip art and multimedia
- applications are common on Macintosh discs.
- Scientific data, technical documentation and
- distribution of large software packages comprise the
- primary areas of CD-ROM application on UNIX workstations.
- Currently, the vast majority of the CD-ROM publishing
- activity is still concentrated in the PC-compatible
- market. Clearly, the advantages offered by CD-ROM
- technology benefit other categories of computing
- machinery as much as they do PCs. What's the hang-up?
-
-
- ISO 9660
-
- Ironically, one of the biggest barriers to widespread use
- of CD-ROM technology on non-DOS based systems is one of
- the biggest reasons that CD-ROM has been successful in
- the PC environment. In 1988, the International Organization
- for Standardization (ISO) established ISO 9660 as the standard
- for recording volume, directory and file information on
- CD-ROM discs. The format is independent of the type of
- computer or operating system used to read the CD-ROM
- disc, allowing a single disc to be accessed by many
- different types of computer systems.
- Measured in terms of market acceptance, the ISO 9660
- standard disc format has been enormously successful.
- Virtually every major desktop computer system can be
- outfitted with a CD-ROM drive and driver software to read
- ISO 9660 formatted discs; a larger variety of different
- makes and models of computers can read ISO 9660 discs
- than any other single disc format.
- In its effort to achieve platform-independent
- information interchange, the ISO 9660 standard adopted a
- very simple, "lowest common denominator" format. The base
- (ISO 9660 Level 1) format imposes restrictions which are
- very similar to those of (actually even more strict than)
- an MS-DOS directory structure. Filenames must be short,
- PC-style names, using only upper case letters, digits and
- the underscore character. Directory names are limited to
- eight characters. Though the filename length limit is
- somewhat softened for ISO 9660 Levels 2 and 3, the format
- still forbids lower case and special characters.
- Directory trees cannot be more than eight levels deep, a
- limit that is routinely exceeded under UNIX. File and directory
- ownership and access permission information is handled
- only through Extended Attribute Records (XARs), which can
- reduce performance to unacceptable levels. Further, the
- special file types and attributes that are supported, and
- often required for proper application operation, under UNIX
- and other advanced operating systems are not even recognized
- by the ISO 9660.
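- The naming and depth restrictions described above can be
- illustrated with a small checker. This is only a sketch, not
- part of any standard or product: it assumes that "short,
- PC-style names" means the 8.3 form, that the root directory
- counts as the first of the eight levels, and it ignores file
- version numbers. The function name level1_problems is
- invented for the illustration.

```python
import re

# Assumed Level 1 rules: 8.3 file names, 8-character directory names,
# built only from upper case letters, digits and the underscore.
FILE_RE = re.compile(r"[A-Z0-9_]{1,8}(\.[A-Z0-9_]{0,3})?")
DIR_RE = re.compile(r"[A-Z0-9_]{1,8}")
MAX_LEVELS = 8  # assumption: the root directory counts as level 1


def level1_problems(path):
    """List the reasons a UNIX-style relative file path breaks the rules."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        return ["empty path"]
    *dirs, name = parts
    problems = []
    for d in dirs:
        if not DIR_RE.fullmatch(d):
            problems.append("directory name not allowed: " + d)
    if not FILE_RE.fullmatch(name):
        problems.append("file name not allowed: " + name)
    # The root plus every named directory must fit within eight levels.
    if len(dirs) + 1 > MAX_LEVELS:
        problems.append("directory tree deeper than eight levels")
    return problems


if __name__ == "__main__":
    for p in ("DOCS/README.TXT", "usr/local/lib/libfoo.so.1"):
        print(p, level1_problems(p))
```

- A typical UNIX path such as usr/local/lib/libfoo.so.1 fails
- on every component, which is the difficulty the rest of this
- article addresses.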
- The file naming conventions and other information
- stored by the ISO 9660 format are so similar to those
- supported by the MS-DOS file system that the PC market
- has enthusiastically embraced the standard. Yet, it is
- because of the DOS-style restrictions imposed by the ISO
- 9660 that many discs intended for non-DOS environments
- are not complying with this standard.
- Many Macintosh discs are formatted using Apple's
- Hierarchical File System (HFS) format. Discs created for
- use under the UNIX operating system often use some form
- of the UNIX File System (UFS) format. UFS can differ
- sufficiently from one UNIX system to the next to prohibit
- the use of UFS format discs on more than one product
- line, even from the same manufacturer.
- Even the High Sierra format, which was the
- predecessor of the ISO 9660, is still being used by some
- regressive American CD-ROM disc publishers. Only a very
- few changes were made before it was adopted as the ISO
- 9660. Virtually every CD-ROM driver, except pre-1989 (and
- thus pre-ISO 9660) versions of MSCDEX from Microsoft, can
- read both the High Sierra and ISO 9660 formats. Though a
- much less serious problem than those posed by the use of
- the Mac HFS or UNIX UFS formats, the continued use of the
- High Sierra format is short-sighted and should be
- discouraged.
-
- ROCK RIDGE INTERCHANGE PROTOCOL
-
- Similarly, whenever a widely supported standard such as
- the ISO 9660 can reasonably be used, either alone or as
- the base for an open, extended format, use of any other,
- incompatible approach should be discouraged. The
- potential abandonment of the ISO 9660 by the large,
- rapidly-growing and traditionally standards-based UNIX
- market is particularly disturbing.
- In an effort to address this concern, several major
- UNIX and CD-ROM vendors created an ad hoc industry
- committee to develop a solution. This committee, taking
- the name Rock Ridge Group from the fictional town in the Mel
- Brooks film Blazing Saddles, set out to define an ISO
- 9660-compliant mechanism for recording complete
- UNIX/POSIX file system information. The resulting
- specification is known as the Rock Ridge Interchange
- Protocol (RRIP).
- The ISO 9660 provides a System Use Area within its
- directory record structure for recording user-definable,
- system-specific information, yet no standard mechanism or
- model for the use of this space was provided. In the
- process of developing the RRIP, the Rock Ridge Group
- formulated a second proposal, the System Use Sharing
- Protocol (SUSP) to standardize the use of this area. As
- the name suggests, the SUSP provides an extensible
- mechanism by which the System Use Area can be shared by
- more than one set of system-specific extensions at once.
- The SUSP records non-ISO 9660 information using
- System Use Fields. These fields are identified using a
- two-byte Signature, which is followed by a one-byte field
- length, a one-byte field version number and a variable
- length, optional data area. The contents of the data
- area depend on the specific System Use Field in question.
- If a receiving system understands the structure of SUSP
- fields but does not recognize a specific field on the
- disc, the consistent recording of a field length allows
- the system to skip to the next field and continue. Using
- this mechanism, sets of System Use Fields defined by
- various groups for different systems can be recorded on
- the same disc. The receiving system will simply look for
- the fields it understands.
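- The field layout just described can be sketched in a few
- lines of Python. This is an illustration only, not the
- reference implementation of any driver. It assumes that the
- length byte counts the whole field, header included (which is
- consistent with the 251-byte figure given below), and the
- signatures "AA", "ZZ" and "XY" are made up for the example.

```python
import struct

# Header: two-byte signature, one-byte field length, one-byte version.
HEADER = struct.Struct("<2sBB")


def make_field(signature, data, version=1):
    """Build one System Use Field; the length byte covers header and data."""
    return HEADER.pack(signature, HEADER.size + len(data), version) + data


def walk_fields(area):
    """Yield (signature, version, data) for every field in a System Use Area."""
    offset = 0
    while offset + HEADER.size <= len(area):
        signature, length, version = HEADER.unpack_from(area, offset)
        if length < HEADER.size or offset + length > len(area):
            break  # padding or a damaged field: stop rather than guess
        data = area[offset + HEADER.size:offset + length]
        yield signature.decode("ascii", "replace"), version, data
        offset += length  # the recorded length is all that is needed to skip


if __name__ == "__main__":
    area = (make_field(b"AA", b"mac") + make_field(b"ZZ", b"\x01\x02")
            + make_field(b"XY", b""))
    understood = {"ZZ"}  # this receiver only knows one kind of field
    for signature, version, data in walk_fields(area):
        if signature in understood:
            print(signature, version, data)
```

- The receiver never needs to interpret the "AA" or "XY" data;
- it steps over them using the length byte alone.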
- The one-byte field length limits the data area to
- 251 bytes (255 minus the field header) but this can be
- effectively extended by defining and using flags in the
- field data area specifying whether, and perhaps how,
- extended field data is recorded. Such methods are
- utilized by the RRIP to record particularly long
- filenames or paths for symbolic links, either of which
- might exceed the 251 bytes available for recording data
- in a single field.
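- One way such a flag could work is sketched below. The layout
- is an assumption made for illustration, not a quotation from
- the RRIP: the first byte of each data area is taken to be a
- flag byte whose low bit means "this value continues in the
- next field", leaving 250 bytes per field for the value itself.
- The signature "LN" and both function names are hypothetical.

```python
MAX_FIELD = 255      # largest value a one-byte length can hold
HEADER_LEN = 4       # signature (2) + length (1) + version (1)
FLAG_CONTINUE = 0x01  # assumed flag bit: more of the value follows
CHUNK = MAX_FIELD - HEADER_LEN - 1  # 250 bytes left after the flag byte


def split_long_value(signature, value, version=1):
    """Spread a long value over as many fields as it needs."""
    chunks = [value[i:i + CHUNK] for i in range(0, len(value), CHUNK)] or [b""]
    fields = []
    for n, chunk in enumerate(chunks):
        flags = FLAG_CONTINUE if n < len(chunks) - 1 else 0
        data = bytes([flags]) + chunk
        fields.append(signature + bytes([HEADER_LEN + len(data), version]) + data)
    return fields


def join_long_value(fields):
    """Reassemble the value, stopping at the first field without the flag."""
    value = b""
    for field in fields:
        flags = field[HEADER_LEN]
        value += field[HEADER_LEN + 1:field[2]]  # field[2] is the length byte
        if not flags & FLAG_CONTINUE:
            break
    return value


if __name__ == "__main__":
    name = b"x" * 600
    fields = split_long_value(b"LN", name)
    print([len(f) for f in fields], join_long_value(fields) == name)
```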
- The SUSP itself provides specifications for several
- generic field types. One field indicates to the receiving
- system that there is data on the disc which was recorded
- utilizing the SUSP. It also states where in the System
- Use Area the recording of SUSP information starts.
- Another field indicates the end of the use of the SUSP in
- a given System Use Area.
- One field allows information to be provided about
- the nature and source of specifications for the sets of
- SUSP-compliant extensions utilized on the disc. There is
- a field which allows the disc publishers to define
- portions of the System Use Area to be ignored by the
- receiving system. Since a single field may be larger than
- the available System Use Area in a directory record, the
- remaining SUSP field allows virtually unlimited
- continuation of a System Use Area.
- Support for all other specific capabilities is left
- to documents such as the RRIP, which use the SUSP as a
- style-guide to provide system-specific capabilities by
- defining collections of System Use Fields. Note that the
- SUSP itself is platform-independent; only the RRIP is
- POSIX-specific. Anyone with a need to record any type of
- system-specific information within an ISO 9660 directory
- structure is encouraged to consider using the SUSP.
- As previously mentioned, the focus of the RRIP is to
- provide support for UNIX/POSIX file systems. The RRIP
- does this by defining fields for recording all the extra
- information required by UNIX systems which is not
- covered by the ISO 9660. In addition, user and group IDs
- and permissions, which are recorded in the ISO 9660 XARs,
- are recorded in an RRIP System Use Field, substantially
- improving system performance.
- The problem of handling directory trees deeper than
- eight levels, prohibited by the ISO 9660 but all too
- common under UNIX, is resolved by using fields which
- enable the transparent encoding of a "virtual tree." This
- tree is accessible to users of RRIP-compliant CD-ROM
- subsystems but is completely invisible to users of
- standard ISO 9660 systems.
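- The idea of a virtual tree can be sketched as a mapping from
- the logical UNIX path of each directory to the place where it
- is physically recorded. The code below is a rough model under
- stated assumptions, not the RRIP mechanism itself: it assumes
- the root counts as level 1, that a directory which would fall
- below level 8 is recorded instead in a shallow holding
- directory, and that the driver uses the recorded mapping to
- present the original tree. The names RR_MOVED, D0000001 and
- plan_virtual_tree are hypothetical.

```python
MAX_LEVELS = 8            # ISO 9660 limit, counting the root as level 1
HOLDING_DIR = "RR_MOVED"  # hypothetical shallow holding directory


def plan_virtual_tree(logical_dirs):
    """Map each logical directory path to the path where it is recorded."""
    physical = {}
    moved = 0
    # Handle parents before children so a child follows a relocated parent.
    for logical in sorted(logical_dirs, key=lambda p: p.count("/")):
        parent, _, name = logical.rpartition("/")
        base = physical.get(parent, parent)
        candidate = base + "/" + name if base else name
        # A path with n components sits at level n + 1 (the root is level 1).
        if candidate.count("/") + 2 > MAX_LEVELS:
            moved += 1
            candidate = "%s/D%07d" % (HOLDING_DIR, moved)
        physical[logical] = candidate
    return physical


if __name__ == "__main__":
    deep = "A/B/C/D/E/F/G/H/I"
    parts = deep.split("/")
    dirs = ["/".join(parts[:n]) for n in range(1, len(parts) + 1)]
    for logical, recorded in plan_virtual_tree(dirs).items():
        print(logical, "->", recorded)
```

- A plain ISO 9660 reader would see only the recorded paths,
- while a reader that understands the mapping can show the
- logical ones.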
- Many UNIX software and hardware vendors have
- postponed taking advantage of the cost-saving potential
- of CD-ROM because of the restrictions imposed by the ISO
- 9660 or the effort required to convert their products to
- operate correctly. Use of the RRIP for creation of CD-ROM
- discs for distributing software, documentation, data or
- training materials will eliminate many of the barriers to
- the use of CD-ROM for UNIX applications.
- Another possible impediment to acceptance of a new
- technology is the absence of standards at the physical
- and logical levels for the device. This is particularly
- true in the standards-oriented UNIX market. POSIX is
- itself a standardization of much of the fundamental,
- functional aspects of the UNIX operating system.
- Fortunately, formal standards exist for every major facet
- of CD-ROM technology.
-
-
- THE STANDARDS PROCESS
-
- The Rock Ridge Group recognized immediately the
- importance of seeking both industry support and formal
- standardization for any proposals they would develop and
- included the pursuit of these in their original goals
- document. In this direction, members of the Rock Ridge
- Group have been actively seeking a broad base of industry
- support for their proposals, speaking at meetings and
- conferences, writing articles and press releases and
- presenting the SUSP and RRIP specifications to major
- standards-developing organizations in the United States.
- The National Institute of Standards and Technology
- (NIST) is currently drafting a proposed Federal
- Information Processing Standard (FIPS) based on the ISO
- 9660, the SUSP and the RRIP. On the international front,
- the IEEE Computer Society is proceeding with plans to
- advance the SUSP and RRIP for adoption as international
- standards by the International Organization for
- Standardization (ISO), which formalized the ISO 9660.
- Simultaneously, the Rock Ridge Group is actively soliciting
- adoption of the
- SUSP and RRIP by major UNIX vendor organizations,
- consortia and standards groups.
- Though formal standards are critical to the UNIX
- market, many vendors feel that, in this case, they cannot
- wait the minimum of two years which formal, international
- bodies take to adopt a new standard. Companies are often
- driven by market pressures into taking the initiative,
- adopting and using formats before they become official
- standards. Many UNIX workstation, software or operating
- system vendors are already developing and even releasing
- products which incorporate support for the Rock Ridge
- protocols.
- Sun Microsystems expects to distribute an ISO 9660
- CD-ROM driver integrating Rock Ridge support in their
- next major operating system release due this fall.
- Hewlett-Packard, Digital Equipment Corporation, Santa
- Cruz Operation, Interactive and many other UNIX vendors
- have already endorsed the Rock Ridge format and RRIP-
- supporting products from these vendors are expected to
- begin appearing this year.
- Young Minds, Inc. is already shipping a version of
- their Makedisc ISO 9660 formatting software which will
- create discs using Rock Ridge extensions. Discs created
- on any of the major UNIX platforms for which this product
- is available will appear as a UNIX-style file system on
- any UNIX system providing Rock Ridge support when
- mounted. Rock Ridge CD-ROM disc publications created
- using the Makedisc software are already available from
- the Sun User Group and Young Minds. Third party CD-ROM
- drivers supporting the SUSP and RRIP are under
- development, or may already be available for many UNIX
- platforms from Young Minds and other UNIX software
- developers.
- Through the SUSP and RRIP, the Rock Ridge Group has
- effectively addressed the need for better support for
- UNIX file systems on CD-ROM while eliminating the
- pressures which were building within the UNIX workstation
- market to dump the ISO 9660 and start from scratch. The
- CD-ROM industry will benefit from the continued
- consolidation around the ISO 9660 as the one base
- standard on which CD-ROMs should be published. The UNIX
- market will benefit by being able to conveniently use
- CD-ROM technology while retaining the ability to exchange
- data with most major types of computers.
-
-
-
-
-